-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sqlstats: include regions in statement_statistics #95449
Conversation
Test failure looks like #96149, possibly related to the unified SQL address resolver? |
Part of #89949. This completes the work abandoned in #92259, properly storing the regions in which a statement was executed, rather than waiting until UI render time to try to map (dangerously ephemeral) node IDs to their respective regions. With this work in place, we will be able to remove both the UI mapping code (usages of this [selector][]) and the stop-gap work of #93268. [selector]: https://github.com/cockroachdb/cockroach/blob/0f6333cbd6c78264a59dc435324c0c33d75ea148/pkg/ui/workspaces/cluster-ui/src/store/nodes/nodes.selectors.ts#L52-L64 Release note (sql change): A regions field was added to the statistics column of crdb_internal.statement_statistics, reporting the regions of the nodes on which the statement was executed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a small request, otherwise
Reviewed 19 of 19 files at r1.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @matthewtodd and @yuzefovich)
-- commits
line 12 at r1:
can you create an issue to track this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @matthewtodd and @yuzefovich)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @maryliag and @yuzefovich)
Previously, maryliag (Marylia Gutierrez) wrote…
can you create an issue to track this?
Done! #98056 and #98057. I'll tackle the first one early this week.
bors r+ |
Build succeeded: |
97138: ui: add error code to stmt and txn insights details pages r=gtr a=gtr Part of: #87785. Previously, the stmt and txn insights details pages did not show any further information for failed executions. This commit adds an "error code" column to the insights table for a failed execution in the stmt and txn insights details pages. Additionally, a "status" column was added to the stmt and txn workload insights tables which is either "Completed" or "Failed". Future work involves adding the error message string in addition to the error code but it needs to be redacted first. Additionally, the txn status is missing the implementation of a "Cancelled" status. Note to reviewers: only consider the second commit, as the first is required to get the txn status. - Loom [demo](https://www.loom.com/share/e82b97ff9f034d82b98640170eb54408). Release note (ui change): Adds error code column to the insights table for a failed execution in the stmt and txn insights details page. Adds status column to the stmt and txn workload insights tables. 98410: cluster-ui: tenants use sqlstats-supplied regions r=matthewtodd a=matthewtodd Fixes #98056. As of #95449, the SQL Activity pages in the DB Console can draw regions information directly from the sqlstats tables, rather than having to translate node IDs to regions on page load. Here, we make that switch, but for non-system tenants only, because: 1. The ephemeral nature of serverless nodes made this view-time mapping especially problematic in that context. (See further notes in #95449.) 2. The system-tenant views also include KV node IDs in a special Regions/Nodes column, which we are unable to recreate given the backend storage structure. (Future design work might suggest removing these node IDs altogether, for a unified UI.) # Screenshots! ## Statements, with and without regions filter <img width="1372" alt="statements" src="https://user-images.githubusercontent.com/5261/225033247-739df90a-9173-4aab-a666-a61a1ceeb579.png"> <img width="1372" alt="statements - filtered" src="https://user-images.githubusercontent.com/5261/225033271-1c0d0f82-3dd4-48ea-bdef-11f19af97a85.png"> ## Statement details <img width="1372" alt="statement details" src="https://user-images.githubusercontent.com/5261/225033338-6dff4a6e-a4a3-48c6-863a-84f1375b0a61.png"> ## Transactions, with and without regions filter <img width="1372" alt="transactions" src="https://user-images.githubusercontent.com/5261/225033366-65f44e95-3549-47cc-b0f2-67ad48a1a1fa.png"> <img width="1372" alt="transactions - filtered" src="https://user-images.githubusercontent.com/5261/225033391-50b9a2dc-e9a1-457b-84b1-837426eba35e.png"> ## Transaction details <img width="1372" alt="transaction details" src="https://user-images.githubusercontent.com/5261/225033505-3fdeceef-35dc-4e06-af25-ab4d0c53518f.png"> Release note: None 98464: jobs,upgrades: add migration to backfill job_info table r=dt a=adityamaru This change adds a migration and corresponding cluster version after which every job entry in the system.jobs table will have its Payload and Progress written to two rows in the system.job_info table. Informs: #97762 Release note: None 98510: backupccl: update restore/nodeshutdown tests to use new roachtest framework r=adityamaru a=msbutler The restore/nodeshutdown tests have been using a very old workload that will not be restorable when #93804 lands. This patch changes the restore/nodeshutdown workload to a 80GB tpce restore and moves the tests to run on aws instead of gcp. Release note: None Epic: None 98579: upgrade/upgrades: skip TestUpgradeSchemaChangerElements r=smg260 a=smg260 Refs: #98062 Reason: flaky test Generated by bin/skip-test. Release justification: non-production code changes Release note: None Epic: None Co-authored-by: gtr <[email protected]> Co-authored-by: Matthew Todd <[email protected]> Co-authored-by: adityamaru <[email protected]> Co-authored-by: Michael Butler <[email protected]> Co-authored-by: Miral Gadani <[email protected]>
96991: roachtest: update mixed-version backup to use new framework r=srosenberg a=renatolabs This updates the `backup/mixed-version` roachtest to use the recently introduced mixed-version roachtest framework (`mixedversion` package). The main behavior exercised remains the same: backups are taken in mixed-binary state, and those backups are restored and verified at the end of the test. However, this commit also improves the coverage of mixed-version backup testing in a few ways: * **Randomization**. By virtue of using the new framework, most runs will be different from one another since the order of actions taken by the test will be different. Previously, backups would always be taken with 2 nodes in the old version and 2 nodes in the new version. Now, backups can be taken when an arbitrary number of nodes is running the new version. As a consequence, it's also possible that some executions will attempt backups when all nodes are running a new binary version, but the cluster version itself has not been updated. Other points of new randomization include the choice of the node's external dir where backups are stored, which node to connect to when running certain statements, and how much to wait between backups. * **Backup Options**. Backups will randomly be created with `revision_history` enabled, or with an `encryption_passphrase`. * **Downgrades**. The cluster is also downgraded in mixed-version tests. No downgrades happened in that test before this commit. * **Workload**. Instead of using fixed call to `generate_series` to generate data between backups, the test now runs the `bank` workload continuously during the test. A random wait between backups allows the workload to make changes to the underlying table during the test and for the backups to be taken while writes are taking place. * **Finalization**: the test _may_ attempt to create a backup as the upgrade is finalizing (i.e., migrations are running and cluster version is advancing). In addition, this test will also see improved coverage as we make more improvements to test plans generated by the `mixedversion` package. These changes will create more backup scenarios in the future without requiring any code changes to this test. This test has already helped us uncover one backup bug (#97953). Epic: CRDB-19321 Release note: None 98398: statusccl: stop serving /_status/nodes to tenants r=matthewtodd a=matthewtodd Fixes #98057. This reverts the work of #93268, which is no longer necessary now that we are eagerly capturing region information at execution time in #95449. Release note: None Co-authored-by: Renato Costa <[email protected]> Co-authored-by: Matthew Todd <[email protected]>
Part of #89949.
This completes the work abandoned in #92259, properly storing the regions in which a statement was executed, rather than waiting until UI render time to try to map (dangerously ephemeral) node IDs to their respective regions.
With this work in place, we will be able to remove both the UI mapping code (usages of this selector) and the stop-gap work of #93268.
Release note (sql change): A regions field was added to the statistics column of crdb_internal.statement_statistics, reporting the regions of the nodes on which the statement was executed.